computer graphic
Learning to Predict Aboveground Biomass from RGB Images with 3D Synthetic Scenes
Forests play a critical role in global ecosystems by supporting biodiversity and mitigating climate change via carbon sequestration. Accurate aboveground biomass (AGB) estimation is essential for assessing carbon storage and wildfire fuel loads, yet traditional methods rely on labor-intensive field measurements or remote sensing approaches with significant limitations in dense vegetation. In this work, we propose a novel learning-based method for estimating AGB from a single ground-based RGB image. We frame this as a dense prediction task, introducing AGB density maps, where each pixel represents tree biomass normalized by the plot area and each tree's image area. We leverage the recently introduced synthetic 3D SPREAD dataset, which provides realistic forest scenes with per-image tree attributes (height, trunk and canopy diameter) and instance segmentation masks. Using these assets, we compute AGB via allometric equations and train a model to predict AGB density maps, integrating them to recover the AGB estimate for the captured scene. Our approach achieves a median AGB estimation error of 1.22 kg/m^2 on held-out SPREAD data and 1.94 kg/m^2 on a real-image dataset. To our knowledge, this is the first method to estimate aboveground biomass directly from a single RGB image, opening up the possibility for a scalable, interpretable, and cost-effective solution for forest monitoring, while also enabling broader participation through citizen science initiatives.
- North America > United States (0.93)
- Europe > Italy (0.04)
- Asia > China > Hong Kong (0.04)
- North America > United States > New Mexico > Los Alamos County > Los Alamos (0.04)
- North America > United States > Texas > Travis County > Austin (0.04)
- North America > United States > Massachusetts > Hampshire County > Northampton (0.04)
- (5 more...)
- Information Technology > Artificial Intelligence > Cognitive Science (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (0.94)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.93)
- (3 more...)
CMI-MTL: Cross-Mamba interaction based multi-task learning for medical visual question answering
Jin, Qiangguo, Zheng, Xianyao, Cui, Hui, Sun, Changming, Fang, Yuqi, Cong, Cong, Su, Ran, Wei, Leyi, Xuan, Ping, Wang, Junbo
Medical visual question answering (Med-VQA) is a crucial multimodal task in clinical decision support and telemedicine. Recent self-attention based methods struggle to effectively handle cross-modal semantic alignments between vision and language. Moreover, classification-based methods rely on predefined answer sets. Treating this task as a simple classification problem may make it unable to adapt to the diversity of free-form answers and overlook the detailed semantic information of free-form answers. In order to tackle these challenges, we introduce a Cross-Mamba Interaction based Multi-Task Learning (CMI-MTL) framework that learns cross-modal feature representations from images and texts. CMI-MTL comprises three key modules: fine-grained visual-text feature alignment (FVTA), cross-modal interleaved feature representation (CIFR), and free-form answer-enhanced multi-task learning (FFAE). FVTA extracts the most relevant regions in image-text pairs through fine-grained visual-text feature alignment. CIFR captures cross-modal sequential interactions via cross-modal interleaved feature representation. FFAE leverages auxiliary knowledge from open-ended questions through free-form answer-enhanced multi-task learning, improving the model's capability for open-ended Med-VQA. Experimental results show that CMI-MTL outperforms the existing state-of-the-art methods on three Med-VQA datasets: VQA-RAD, SLAKE, and OVQA. Furthermore, we conduct more interpretability experiments to prove the effectiveness. The code is publicly available at https://github.com/BioMedIA-repo/CMI-MTL.
- Research Report > New Finding (0.48)
- Research Report > Promising Solution (0.34)
- North America > United States > New Mexico > Los Alamos County > Los Alamos (0.04)
- North America > United States > Texas > Travis County > Austin (0.04)
- North America > United States > Massachusetts > Hampshire County > Northampton (0.04)
- (5 more...)
- Information Technology > Artificial Intelligence > Cognitive Science (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (0.94)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.93)
- (3 more...)
Approximate Bayesian Image Interpretation using Generative Probabilistic Graphics Programs
The idea of computer vision as the Bayesian inverse problem to computer graphics has a long history and an appealing elegance, but it has proved difficult to directly implement. Instead, most vision tasks are approached via complex bottom-up processing pipelines. Here we show that it is possible to write short, simple probabilistic graphics programs that define flexible generative models and to automatically invert them to interpret real-world images. Generative probabilistic graphics programs consist of a stochastic scene generator, a renderer based on graphics software, a stochastic likelihood model linking the renderer's output and the data, and latent variables that adjust the fidelity of the renderer and the tolerance of the likelihood model. Representations and algorithms from computer graphics, originally designed to produce high-quality images, are instead used as the deterministic backbone for highly approximate and stochastic generative models.
Identification of Violin Reduction via Contour Lines Classification
Beghin, Philémon, Ceulemans, Anne-Emmanuelle, Glineur, François
The first violins appeared in late 16th-century Italy. Over the next 200 years, they spread across Europe and luthiers of various royal courts, eager to experiment with new techniques, created a highly diverse family of instruments. Around 1750, size standards were introduced to unify violin making for orchestras and conservatories. Instruments that fell between two standards were then reduced to a smaller size by luthiers. These reductions have an impact on several characteristics of violins, in particular on the contour lines, i.e. lines of constant altitude, which look more like a U for non reduced instruments and a V for reduced ones. While such differences are observed by experts, they have not been studied quantitatively. This paper presents a method for classifying violins as reduced or non-reduced based on their contour lines. We study a corpus of 25 instruments whose 3D geometric meshes were acquired via photogrammetry. For each instrument, we extract 10-20 contour lines regularly spaced every millimetre. Each line is fitted with a parabola-like curve (with an equation of the type y = alpha*abs(x)**beta) depending on two parameters, describing how open (beta) and how vertically stretched (alpha) the curve is. We compute additional features from those parameters, using regressions and counting how many values fall under some threshold. We also deal with outliers and non equal numbers of levels, and eventually obtain a numerical profile for each instrument. We then apply classification methods to assess whether geometry alone can predict size reduction. We find that distinguishing between reduced and non reduced instruments is feasible to some degree, taking into account that a whole spectrum of more or less transformed violins exists, for which it is more difficult to quantify the reduction. We also find the opening parameter beta to be the most predictive.
- Europe > Italy (0.24)
- Europe > Belgium > Wallonia > Walloon Brabant > Louvain-la-Neuve (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Europe > Belgium > Brussels-Capital Region > Brussels (0.04)
- Media > Music (1.00)
- Leisure & Entertainment (1.00)
Graphics4Science: Computer Graphics for Scientific Impacts
Chen, Peter Yichen, Guo, Minghao, Pfister, Hanspeter, Lin, Ming, Freeman, William, Huang, Qixing, Shen, Han-Wei, Matusik, Wojciech
Computer graphics, often associated with films, games, and visual effects, has long been a powerful tool for addressing scientific challenges--from its origins in 3D visualization for medical imaging to its role in modern computational modeling and simulation. This course explores the deep and evolving relationship between computer graphics and science, highlighting past achievements, ongoing contributions, and open questions that remain. We show how core methods, such as geometric reasoning and physical modeling, provide inductive biases that help address challenges in both fields, especially in data-scarce settings. To that end, we aim to reframe graphics as a modeling language for science by bridging vocabulary gaps between the two communities. Designed for both newcomers and experts, Graphics4Science invites the graphics community to engage with science, tackle high-impact problems where graphics expertise can make a difference, and contribute to the future of scientific discovery. Additional details are available on the course website: https://graphics4science.github.io
- North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.06)
- North America > United States > Texas > Travis County > Austin (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > Maryland > Prince George's County > College Park (0.04)
Navigating High-Dimensional Backstage: A Guide for Exploring Literature for the Reliable Use of Dimensionality Reduction
Jeon, Hyeon, Lee, Hyunwook, Kuo, Yun-Hsin, Yang, Taehyun, Archambault, Daniel, Ko, Sungahn, Fujiwara, Takanori, Ma, Kwan-Liu, Seo, Jinwook
Visual analytics using dimensionality reduction (DR) can easily be unreliable for various reasons, e.g., inherent distortions in representing the original data. The literature has thus proposed a wide range of methodologies to make DR-based visual analytics reliable. However, the diversity and extensiveness of the literature can leave novice analysts and researchers uncertain about where to begin and proceed. To address this problem, we propose a guide for reading papers for reliable visual analytics with DR. Relying on the previous classification of the relevant literature, our guide helps both practitioners to (1) assess their current DR expertise and (2) identify papers that will further enhance their understanding. Interview studies with three experts in DR and data visualizations validate the significance, comprehensiveness, and usefulness of our guide.
- North America > United States > California > Yolo County > Davis (0.14)
- Europe > Sweden > Östergötland County > Linköping (0.04)
- Asia > South Korea > Seoul > Seoul (0.04)
- Research Report (0.64)
- Overview (0.46)
- Questionnaire & Opinion Survey (0.46)
Evaluating LLMs for Visualization Tasks
Khan, Saadiq Rauf, Chandak, Vinit, Mukherjea, Sougata
Information Visualization has been utilized to gain insights from complex data. In recent times, Large Language Models (LLMs) have performed very well in many tasks. In this paper, we showcase the capabilities of different popular LLMs to generate code for visualization based on simple prompts. We also analyze the power of LLMs to understand some common visualizations by answering simple questions. Our study shows that LLMs could generate code for some visualizations as well as answer questions about them. However, LLMs also have several limitations. We believe that our insights can be used to improve both LLMs and Information Visualization systems.
Neural Path Guiding with Distribution Factorization
Figueiredo, Pedro, He, Qihao, Kalantari, Nima Khademi
In this paper, we present a neural path guiding method to aid with Monte Carlo (MC) integration in rendering. Existing neural methods utilize distribution representations that are either fast or expressive, but not both. We propose a simple, but effective, representation that is sufficiently expressive and reasonably fast. Specifically, we break down the 2D distribution over the directional domain into two 1D probability distribution functions (PDF). We propose to model each 1D PDF using a neural network that estimates the distribution at a set of discrete coordinates. The PDF at an arbitrary location can then be evaluated and sampled through interpolation. To train the network, we maximize the similarity of the learned and target distributions. To reduce the variance of the gradient during optimizations and estimate the normalization factor, we propose to cache the incoming radiance using an additional network. Through extensive experiments, we demonstrate that our approach is better than the existing methods, particularly in challenging scenes with complex light transport.
- Europe > Austria > Vienna (0.04)
- Asia (0.04)
- North America > United States > Texas (0.04)
- (2 more...)